Abstract

In a case-control study aimed at localizing disease variants, association between a marker 
and the disease status is often tested by comparing the marker allele frequencies among cases 
and controls. These marker allele frequencies are expected to be different if the marker is 
associated with the disease. The power of the commonly used allele based test depends on
the marker allele frequency; markers with a low minor allele frequency have less power to
be detected (if they are associated with the disease) than markers with a high minor allele
frequency. Therefore, the strategy of selecting markers for follow-up study based on their
p-values favors markers with a high minor allele frequency.

We propose an allele based test that does not have this (unwanted) property and is therefore
more powerful for markers with a low minor allele frequency. This test may, therefore,
be more effective when searching for rare causal variants. The asymptotic power function of
the test is derived and simulation studies are performed to assess the finite sample properties
of the test. Next, the existing and the proposed tests are applied to data; this is not included yet.

In the light of the current interest in detecting association between complex phenotypes
and causal variants with low minor allele frequencies, this test is expected to be of relevance.

Case-control study, allele based test, linkage disequilibrium (LD), power, p-values, minor allele 
frequency (MAF) 

1 Introduction


To locate disease variants and dichotomous trait loci, association studies of genetic markers are
often conducted with a case-control design. Several genetic association tests have been proposed
(see e.g. [?]). One simple, and perhaps one of the most natural, tests is the single marker
allele based test, which is based on the difference of the sample marker allele frequencies in cases
and controls.

A marker is associated with the disease or trait if it is in linkage disequilibrium (LD) with
one of the causal variants (see e.g. [2]). On average, the larger the degree of LD, the smaller
the p-value of the association test. However, the p-value obtained with the (commonly used)
single marker allele based test is also a function of the marker allele frequency; markers with
high minor allele frequencies (MAFs) have more power to be detected than markers with low
MAFs. The strategy of selecting interesting markers for follow-up study based on their p-values
only is therefore biased towards markers with high MAFs.

Several other strategies for prioritizing the markers for follow-up studies have been proposed
and compared, like ranking markers based on the Bayes Factor signal ([12, 13]), the likelihood
ratio signal ([9]), the frequentist factor signal ([?]), and PrPES signals ([?]). Reference [?]
compared these strategies, including ranking markers based on the p-values of the allele based
test and the Cochran-Armitage trend test, by applying them to two data-sets. The markers with the
smallest p-values obtained from the allele based test are also top-ranked by the other methods.
Some strategies down-weight markers with small MAF even more than the allele based test does.

In this paper we propose an allele based test for testing association in case-control studies
that does not favor markers with high minor allele frequencies. It is shown that the test has
more power than the commonly used allele based test if the minor allele frequency of the marker
is quite low. The proposed test-statistic is found by standardizing the difference of sample allele
frequencies in a different way than the commonly used allele based test does. An explicit asymptotic
power function is derived and finite sample properties are obtained by simulation studies. The
test will be applied to data.

“Missing” heritability refers to the fact that for many traits, only a small proportion of the
variability in the population can be explained by causal variants that have been identified up
to now (see e.g. [?]). One possible explanation for this “missing” heritability is the presence
of low-frequency variants with a relatively strong effect on disease risk. Indeed, rare variants found
by resequencing have already been described to affect complex diseases ([?]). In the light of
the current interest in detecting association between complex phenotypes and low-frequency
variants and in localizing causal variants with small minor allele frequencies, the implications of
the present paper are expected to be of relevance.

2 Methods 

2.1 Setting 

The case-control status for a random individual in the general population is denoted by $X$;
$X = 1$ for a case and $X = 0$ for a control. The fractions of cases and controls in the total
population are denoted by $\pi = P(X = 1)$ and $1 - \pi = P(X = 0)$. There may be multiple
causal variants; the aim in association studies is to locate these variants. We initially assume
that any given marker is in linkage disequilibrium with at most one causal variant. Therefore,
it is sufficient to consider the situation where there is only one causal variant. Suppose the
causal variant is biallelic with alleles $A_1$ and $A_2$ and with corresponding allele frequencies $p_1$
and $p_2 = 1 - p_1$ in the total population. In case the causal variant is not biallelic, the second
allele, $A_2$, could be regarded as all alleles which are not the $A_1$-allele. We denote the fraction
of individuals with the disease ($X = 1$) among those individuals with genotype $(A_i, A_j)$ at the
causal variant by $\pi_{ij} = P(X = 1 \mid A_i A_j)$ for $i, j = 1, 2$ and we assume that $\pi_{12} = \pi_{21}$.

We consider a biallelic marker which may or may not be in proximity with the causal variant.
One of the marker alleles is denoted as $M_1$ and the other allele as $M_2$. The allele frequencies
for $M_1$ and $M_2$ in the general population are denoted as $q_1$ and $q_2 = 1 - q_1$, and the $M_i$ allele
frequencies among the controls and cases are denoted as $q_{i|0} = q_{i|X=0}$ and $q_{i|1} = q_{i|X=1}$ for
$i = 1, 2$.

A common measure for the degree of LD between a marker and a causal variant is given
by the quantity $\Delta_{ij} = D_{ij}/\sqrt{p_1 p_2 q_1 q_2}$, with $D_{ij} = P(A_i M_j) - p_i q_j$, where $P(A_i M_j)$ is the
$(A_i, M_j)$ haplotype-frequency in the total population (see for instance [?]). By definition
$\Delta_{ij} = \mathrm{cor}(1_{A_i}, 1_{M_j})$ with, for a randomly chosen haplotype, $1_{A_i}$ and $1_{M_j}$ the indicator
functions which equal 1 if the causal variant is $A_i$ and 0 otherwise, and similarly for the marker
allele $M_j$. In Appendix A in the Supplementary Material it is derived that

\[ \frac{q_{1|0} - q_{1|1}}{\sqrt{q_1 q_2}} = \Delta\, \frac{p_{1|0} - p_{1|1}}{\sqrt{p_1 p_2}} \tag{1} \]

with $\Delta = \Delta_{11}$ and with $p_{1|0}$ and $p_{1|1}$ the $A_1$ allele frequencies among the controls and
cases. So, the relative difference between the allele frequencies among the controls and
cases at the causal variant (the quotient on the right hand side of the expression) is passed on
to the neighboring markers by multiplying this relative difference by $\Delta$, the degree of linkage
disequilibrium between the alleles at the marker and the causal variant. From this formula it
can be directly seen that the $M_1$ allele frequencies among the controls and cases are equal if the
marker is in linkage equilibrium with the causal variant, i.e. $\Delta = 0$.

In order to find markers that are associated with the disease or, actually, are in linkage
disequilibrium with the causal variant, case-control data are collected. Suppose we have a sample
of $R$ individuals from the cases and $S$ individuals from the controls; $R$ and $S$ are fixed and
non-random. Their sum is denoted as $N = S + R$. Since every genotype consists of two alleles,
there are actually $2R$ and $2S$ alleles from the cases and controls, respectively. The numbers
of $M_1$ alleles among the cases and controls are denoted as $R_1$ and $S_1$, respectively. For the
number of $M_2$ alleles, the notation is analogous: $R_2$ and $S_2$ for the cases and controls. Note
that $R_1 + R_2 = 2R$ and $S_1 + S_2 = 2S$. Based on these data the fractions of $M_1$ alleles among the
controls and the cases can be estimated as $\hat q_{1|0} = S_1/(2S)$ and $\hat q_{1|1} = R_1/(2R)$.

In the following, two allele based association tests are described for testing the null hypothesis
$H_0: q_{1|0} = q_{1|1}$ against the alternative $H_1: q_{1|0} \neq q_{1|1}$. The first test we describe is the
commonly used test. This test is based on the difference $\hat q_{1|0} - \hat q_{1|1}$, standardized so that the
test-statistic is asymptotically standard normally distributed under the null hypothesis of no
association. Calculations will show that, because of the way of standardizing, markers with a low
minor allele frequency have less power to be detected under the alternative hypothesis than markers
with a high minor allele frequency. The test we propose is also based on the difference
$\hat q_{1|0} - \hat q_{1|1}$, but standardized in a different way. In Section 3 the power of the two tests is compared.


2.2 Commonly used allele based association test 

The commonly used allele based test-statistic for testing the null hypothesis $H_0: q_{1|0} = q_{1|1}$
against $H_1: q_{1|0} \neq q_{1|1}$ is given by

\[ T = \frac{\hat q_{1|0} - \hat q_{1|1}}{\sqrt{\hat V}}, \]

with $\hat q_{1|0} = S_1/(2S)$ and $\hat q_{1|1} = R_1/(2R)$. Furthermore, $\hat V$ is an estimator of the variance
$V = \mathrm{Var}(\hat q_{1|0} - \hat q_{1|1}) = V_0 + V_1$, with $V_0$ and $V_1$ defined as

\[ V_0 = \mathrm{Var}\,\hat q_{1|0} = \frac{q_{1|0}(1 - q_{1|0})}{2S}, \qquad V_1 = \mathrm{Var}\,\hat q_{1|1} = \frac{q_{1|1}(1 - q_{1|1})}{2R}. \tag{2} \]

The unknown frequencies $q_{1|0}$ and $q_{1|1}$ in $V_0$ and $V_1$ are usually estimated by $\hat q_{1|0}$ and $\hat q_{1|1}$, so
that $\hat V$ is asymptotically unbiased under the null and the alternative hypothesis. For $\lambda = R/N$ the
fraction of sampled cases, the variance $V$ equals

\[ V = V_0 + V_1 = \frac{q_{1|0} q_{2|0}}{2S} + \frac{q_{1|1} q_{2|1}}{2R} = m^{-1}\bigl(\lambda q_{1|0} q_{2|0} + (1 - \lambda) q_{1|1} q_{2|1}\bigr) \]

for $m = 2N\lambda(1 - \lambda)$. For large sample sizes, the test-statistic $T$ has, approximately, a normal
distribution with mean

\[ \sqrt{m}\, \frac{q_{1|0} - q_{1|1}}{\sqrt{\lambda q_{1|0} q_{2|0} + (1 - \lambda) q_{1|1} q_{2|1}}} = \sqrt{m}\, Q B \Delta \tag{3} \]

and variance 1, where

\[ B = \frac{p_{1|0} - p_{1|1}}{\sqrt{p_1 p_2}} \qquad \text{and} \qquad Q^2 = \frac{q_1 q_2}{\lambda q_{1|0} q_{2|0} + (1 - \lambda) q_{1|1} q_{2|1}}. \tag{4} \]

The equality in (3) follows from (1). Under the null hypothesis that $q_{1|0} = q_{1|1}$ (i.e. $\Delta = 0$), $T$
has, asymptotically, a standard normal distribution, whence $H_0$ is rejected for $|T| > z_{\alpha/2}$, with
$z_{\alpha/2}$ the upper $\alpha/2$-quantile of the standard normal distribution. The two-sided power function
of the test (approximately) equals, for large sample sizes,

\[ P(|T| > z_{\alpha/2}) = 1 - \Phi(z_{\alpha/2} - \sqrt{m} B \Delta Q) + \Phi(-z_{\alpha/2} - \sqrt{m} B \Delta Q). \tag{5} \]

The power of the test is controlled by the product $\sqrt{m} B \Delta Q$. The first factor, $\sqrt{m}$, is specific for
the way of sampling and is, therefore, equal for every marker. The second factor, $B$, is specific
for the causal variant and is equal for all markers that are in linkage disequilibrium with the
same causal variant. The third factor, $\Delta$, measures the degree of LD between the marker and the
causal variant and its value varies over the markers. The last factor, $Q$, depends on the marker.
Under the null hypothesis that a marker is not associated with the disease $Q = 1$, but under
the alternative $Q$ is either smaller or larger than 1, depending on the (conditional) marker allele
frequencies. That means that the markers in the association study are weighted; markers with
$Q > 1$ get more power to be detected than markers with $Q < 1$.
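
The statistic $T$ and its asymptotic power (5) can be sketched in a few lines of Python. This is an illustration only: the function names `allele_test_T` and `power_T` and the example counts are hypothetical, and the power is computed from the conditional marker allele frequencies through the mean in (3) rather than from $B$, $\Delta$ and $Q$ separately.

```python
from math import sqrt
from statistics import NormalDist

def allele_test_T(S1, S, R1, R):
    """Statistic T from the M1 allele counts S1 (controls) and R1 (cases)."""
    q10_hat = S1 / (2 * S)   # M1 frequency among the 2S control alleles
    q11_hat = R1 / (2 * R)   # M1 frequency among the 2R case alleles
    # Plug-in estimate of V = V0 + V1 from (2)
    v_hat = q10_hat * (1 - q10_hat) / (2 * S) + q11_hat * (1 - q11_hat) / (2 * R)
    return (q10_hat - q11_hat) / sqrt(v_hat)

def power_T(q10, q11, S, R, alpha):
    """Asymptotic two-sided power (5) for true frequencies q10 and q11."""
    N = S + R
    lam = R / N                       # fraction of sampled cases
    m = 2 * N * lam * (1 - lam)
    # Mean of T as in (3); it equals sqrt(m) * Q * B * Delta
    pooled = lam * q10 * (1 - q10) + (1 - lam) * q11 * (1 - q11)
    mean = sqrt(m) * (q10 - q11) / sqrt(pooled)
    std_normal = NormalDist()
    z = -std_normal.inv_cdf(alpha / 2)   # upper alpha/2 quantile
    return 1 - std_normal.cdf(z - mean) + std_normal.cdf(-z - mean)

# Hypothetical counts: 200 of 2000 control alleles and 260 of 2000 case alleles are M1
t_stat = allele_test_T(S1=200, S=1000, R1=260, R=1000)
p_value = 2 * NormalDist().cdf(-abs(t_stat))   # two-sided p-value
```

When the two conditional frequencies are equal, `power_T` returns the nominal level `alpha`, as expected under the null hypothesis.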

2.3 Allele based test for markers with low MAF


Define the test-statistic

\[ W = \sqrt{m}\, \frac{\hat q_{1|0} - \hat q_{1|1}}{\sqrt{\hat q_1 \hat q_2}} \tag{6} \]

with $m = 2N\lambda(1 - \lambda)$ (as before) and $\hat q_1 = \hat\pi \hat q_{1|1} + (1 - \hat\pi)\hat q_{1|0}$ and $\hat q_2 = \hat\pi \hat q_{2|1} + (1 - \hat\pi)\hat q_{2|0}$, where
$\hat\pi$ is an estimate of the disease prevalence $P(X = 1)$. This estimate cannot be obtained from
the samples of cases and controls, but should be based on external data. For many
diseases, an estimate of the population prevalence $\pi$ is available, for instance in the literature
or in national registries.

The test-statistics $W$ and $T$ are related via $W = T\hat Q^{-1}$, with $\hat Q$ an estimate
of $Q$ in (4); it is found by inserting the estimates $\hat q_1$ and $\hat q_2$ (as just defined) and the marker
sample frequencies among cases and controls. By the law of large numbers, $\hat Q$ approximately
equals $Q$ for large samples and from Slutsky’s lemma and the continuous mapping theorem (see
e.g. [13]) it follows that $W$ has, approximately, a normal distribution with mean

\[ \sqrt{m}\, \frac{q_{1|0} - q_{1|1}}{\sqrt{q_1 q_2}} = \sqrt{m}\, B \Delta \]

and variance $Q^{-2}$. Under the null hypothesis, $q_{1|0} = q_{1|1}$, $\Delta = 0$ and $Q = 1$, and $W$ has,
asymptotically, a standard normal distribution, whence $H_0$ is rejected for $|W| > z_{\alpha/2}$, with $z_{\alpha/2}$
the upper $\alpha/2$-quantile of the standard normal distribution. Under the null hypothesis
$\hat q_1 = \hat\pi \hat q_{1|1} + (1 - \hat\pi)\hat q_{1|0} \approx \hat\pi q_1 + (1 - \hat\pi) q_1 = q_1$, no matter the value of $\hat\pi$. That means that, even
if the estimate of $\pi$ is far away from the true value, the type I error of the test will still be
approximately correct if the sample sizes are large enough. The two-sided power function for $W$
is given by

\[ P(|W| > z_{\alpha/2}) \approx P(|T| > z_{\alpha/2} Q) = 1 - \Phi(z_{\alpha/2} Q - \sqrt{m} B \Delta Q) + \Phi(-z_{\alpha/2} Q - \sqrt{m} B \Delta Q). \tag{7} \]

The power functions for $W$ and $T$ are very similar, but differ in the way $Q$ enters the expression.
In the power function for $W$, the quantile $z_{\alpha/2}$ is also multiplied by $Q$. If $Q = 1$ (i.e. under
$H_0$) the power functions are equal (to $\alpha$). Under the alternative hypothesis either $Q > 1$ or $Q < 1$.
If the sample size is large, this will probably also hold for $\hat Q$ (since $\hat Q$ converges in probability
to $Q$ if $\hat\pi$ also converges in probability to $\pi$). From the definitions of $T$ and $W$, it follows that
$|W| > |T|$ if $\hat Q < 1$ and $|W| < |T|$ if $\hat Q > 1$; if $Q < 1$ the test based on $W$ is more powerful,
whereas the test based on $T$ is more powerful if $Q > 1$.

In Appendix B in the Supplementary Materials it is shown that, if $M_1$ is positively correlated
with the risk allele ($A_1$ or $A_2$) and $q_1$ is sufficiently small, $Q$ will be smaller than 1. However, if
$M_1$ is negatively correlated with the risk allele, $Q > 1$. If the risk allele is the minor allele and
$q_1$ is small, strong negative correlations with the risk allele are not possible within the parameter
space of the genetic model. This can easily be seen as follows. Recall that
$\Delta = (P(A_1 M_1) - p_1 q_1)/\sqrt{p_1 p_2 q_1 q_2}$. Consequently, $\Delta \geq -p_1 q_1/\sqrt{p_1 p_2 q_1 q_2}$. If $p_1 = q_1 = 0.05$,
then $\Delta \geq -0.053$; $p_1 = 0.25$, $q_1 = 0.05$ yields $\Delta \geq -0.13$, and $p_1 = q_1 = 0.25$ yields $\Delta \geq -0.33$.
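
The relation between $W$, $T$ and $\hat Q$ described in this section can be checked numerically. The sketch below computes the three quantities from allele counts; the function name `allele_tests`, the counts and the prevalence estimate `pi_hat = 0.10` are hypothetical values chosen for illustration.

```python
from math import sqrt

def allele_tests(S1, S, R1, R, pi_hat):
    """Return (T, W, Q_hat) for M1 allele counts S1 (controls) and R1 (cases)."""
    q10, q11 = S1 / (2 * S), R1 / (2 * R)
    N = S + R
    lam = R / N                      # fraction of sampled cases
    m = 2 * N * lam * (1 - lam)
    diff = q10 - q11
    # Variance term used by T: lam*q10*q20 + (1 - lam)*q11*q21
    pooled = lam * q10 * (1 - q10) + (1 - lam) * q11 * (1 - q11)
    # Population frequency estimate used by W; note that q2_hat = 1 - q1_hat
    q1_hat = pi_hat * q11 + (1 - pi_hat) * q10
    T = sqrt(m) * diff / sqrt(pooled)
    W = sqrt(m) * diff / sqrt(q1_hat * (1 - q1_hat))
    Q_hat = sqrt(q1_hat * (1 - q1_hat) / pooled)
    return T, W, Q_hat

# Hypothetical marker whose minor allele is more frequent among cases
T, W, Q_hat = allele_tests(S1=100, S=1000, R1=160, R=1000, pi_hat=0.10)
```

For these counts $\hat Q < 1$ and $|W| > |T|$, in line with the discussion above.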

2.4 Generalization of the test-statistic: $W_\delta$

The test-statistic $W$ can be generalized by allowing other values instead of $\hat\pi$. Define the
test-statistic

\[ W_\delta = \sqrt{m}\, \frac{\hat q_{1|0} - \hat q_{1|1}}{\sqrt{(\delta \hat q_{1|1} + (1 - \delta)\hat q_{1|0})(\delta \hat q_{2|1} + (1 - \delta)\hat q_{2|0})}}. \tag{8} \]

For $\delta = \hat\pi$, this statistic $W_\delta$ equals $W$ and, furthermore, $W_\delta = T\hat Q_\delta^{-1}$ with $\hat Q_\delta$ an estimate of
$Q_\delta$, where

\[ Q_\delta^2 = \frac{(\delta q_{1|1} + (1 - \delta) q_{1|0})(\delta q_{2|1} + (1 - \delta) q_{2|0})}{\lambda q_{1|0} q_{2|0} + (1 - \lambda) q_{1|1} q_{2|1}} \]

(similar to $Q^2$ as defined before). The estimate of $Q_\delta$ is found by inserting estimates of the
allele frequencies, as is done for $Q$.

Under the null hypothesis, $q_{1|0} = q_{1|1}$ and the denominator of $W_\delta$ converges in probability
to $\sqrt{q_1 q_2}$, whatever the value of $\delta$. That means that under the null hypothesis, the
test-statistic $W_\delta$ has, approximately, a standard normal distribution for large sample sizes. So, the
null hypothesis of no association is rejected if $|W_\delta| > z_{\alpha/2}$.

If $\delta = \pi$, the denominator converges to $\sqrt{q_1 q_2}$ under the alternative hypothesis. So, only in
that case does the mean of the test-statistic approximate $\sqrt{m} B \Delta$ for large sample sizes. The test
is most powerful if, under the alternative hypothesis, the denominator is minimized. If $M_1$ is
the minor allele and positively correlated with the disease and, thus, $q_{1|0} < q_{1|1} < 0.5$, the test
is most powerful for $\delta = 0$ and least powerful for $\delta = 1$. If $0.5 > q_{1|0} > q_{1|1}$ (the minor allele $M_1$ is
negatively correlated with the disease), it is the other way around. For $0 < \delta < 1$, the power is
in between the minimum and maximum. In practice it is unknown whether the minor allele $M_1$
is positively or negatively correlated with the disease.

2.5 Combined test of W and T

The two test-statistics $W$ and $T$ are linked via $T = W\hat Q$. If $Q > 1$, the test with test-statistic $T$
is more powerful, whereas the opposite holds if $Q < 1$. If $\hat Q = 1$ then $T = W$ and the two tests
are equivalent. To have the best of both tests, the two tests could be combined. Define

\[ U = T\, 1_{\hat Q > 1} + W\, 1_{\hat Q \leq 1} = W \hat Q\, 1_{\hat Q > 1} + W\, 1_{\hat Q \leq 1} \tag{9} \]

as the combined test-statistic. By combining the law of large numbers, Slutsky’s lemma and the
continuous mapping theorem (see e.g. [?]), $U$ has asymptotically a standard normal distribution
under the null hypothesis of no association; the null hypothesis is rejected for $|U| > z_{\alpha/2}$. The
asymptotic power function for $U$ equals the one of $W$ if $Q < 1$ and of $T$ if $Q > 1$.

In a similar way the test-statistics $W_\delta$ and $T$ could be combined.

2.6 Multiple causal variants

Suppose there are multiple causal variants, but every marker is in linkage disequilibrium with
at most one causal variant. The value of $B$ is specific for the causal variant and will, therefore,
be the same for all markers which are in linkage disequilibrium with this causal variant. Since
the power function (and the p-value) depends on the value of $B$, markers can only be ranked on
their p-values locally, that is, among markers which are in linkage disequilibrium with the same
causal variant. In practice it is unknown whether there is only one or multiple causal variants. One
should be careful when comparing p-values for markers on different chromosomes or located far
apart.
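
The generalized statistic $W_\delta$ and the combined statistic $U$ can be sketched as follows. The function names `w_delta` and `combined_u` and the allele counts are hypothetical and serve only as an illustration of the definitions in Sections 2.4 and 2.5.

```python
from math import sqrt

def w_delta(S1, S, R1, R, delta):
    """Generalized statistic W_delta; delta = pi_hat gives W."""
    q10, q11 = S1 / (2 * S), R1 / (2 * R)
    N = S + R
    lam = R / N
    m = 2 * N * lam * (1 - lam)
    # delta*q11 + (1 - delta)*q10; the matching M2 term equals 1 - q_mix
    q_mix = delta * q11 + (1 - delta) * q10
    return sqrt(m) * (q10 - q11) / sqrt(q_mix * (1 - q_mix))

def combined_u(S1, S, R1, R, pi_hat):
    """Combined statistic U: equals T if Q_hat > 1 and W otherwise."""
    q10, q11 = S1 / (2 * S), R1 / (2 * R)
    lam = R / (S + R)
    pooled = lam * q10 * (1 - q10) + (1 - lam) * q11 * (1 - q11)
    q1_hat = pi_hat * q11 + (1 - pi_hat) * q10
    Q_hat = sqrt(q1_hat * (1 - q1_hat) / pooled)
    W = w_delta(S1, S, R1, R, pi_hat)
    T = W * Q_hat                     # the two statistics are linked via T = W * Q_hat
    return T if Q_hat > 1 else W

# Minor allele more frequent among cases: Q_hat < 1, so U coincides with W
u_cases_enriched = combined_u(S1=100, S=1000, R1=160, R=1000, pi_hat=0.10)
# Minor allele more frequent among controls: Q_hat > 1, so U coincides with T
u_controls_enriched = combined_u(S1=160, S=1000, R1=100, R=1000, pi_hat=0.10)
```

By construction $|U|$ is the larger of $|T|$ and $|W|$, which is why the combined test inherits the better of the two power functions.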

2.7 Measuring the effect size

For the markers that show significant association with the disease status, the effect size is of
interest. When testing based on the test-statistics $T$ or $W$, the difference between the allele
frequencies among cases and controls, $\hat q_{1|0} - \hat q_{1|1}$, could be used as a measure of effect, but this
difference is difficult to interpret. In practice the Cochran-Armitage test or the test-statistic $T$
is often used for testing association, whereas an effect size in terms of odds ratios is estimated by
fitting a logistic regression model ([?]). It would be more natural to use the same statistic for
testing association and estimating an effect size. This can be done within a logistic regression
model, but also based on the test-statistics $W$ or $T$. The effect size defined as

\[ \frac{P(X = 0 \mid M_1)/P(X = 0)}{P(X = 1 \mid M_1)/P(X = 1)} \]

can be estimated and a corresponding confidence interval can be constructed. This can be seen
by writing this fraction as (using Bayes' theorem) $q_{1|0}/q_{1|1} = 1 + (q_{1|0} - q_{1|1})/q_{1|1}$ and noting
that $\hat q_{1|1}$ is consistent by the law of large numbers. This quantity is not based on any model
assumptions, unlike in a logistic regression model, and is therefore very appropriate for quantifying
the marker effect size.
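
A minimal sketch of estimating the ratio $q_{1|0}/q_{1|1}$ with an interval is given below. The text does not specify how the confidence interval is constructed; the sketch uses one common choice, a delta-method interval on the log scale, which is our assumption and not necessarily the intended construction. It also assumes that the alleles within each sample are independent binomial draws. The function name and the counts are hypothetical.

```python
from math import exp, sqrt
from statistics import NormalDist

def allele_frequency_ratio(S1, S, R1, R, level=0.95):
    """Estimate q_{1|0}/q_{1|1} with a log-scale delta-method interval (assumed)."""
    q10, q11 = S1 / (2 * S), R1 / (2 * R)
    ratio = q10 / q11
    # Delta method: Var(log q_hat) is approximately (1 - q) / (n * q) for n alleles
    se_log = sqrt((1 - q10) / (2 * S * q10) + (1 - q11) / (2 * R * q11))
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return ratio, ratio * exp(-z * se_log), ratio * exp(z * se_log)

# Hypothetical counts: M1 frequency 0.05 among controls and 0.08 among cases
ratio, lower, upper = allele_frequency_ratio(S1=100, S=1000, R1=160, R=1000)
```

A ratio below 1, with an interval that excludes 1, indicates that the $M_1$ allele is enriched among the cases.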

When determining a quantity for measuring an effect size, it is important to keep in mind
what the aim of the study is. In case one aims to estimate the disease risk based on the observed
genotypes at markers which are possibly not the causal variant, odds ratios for different markers
would be appropriate. However, in case one aims to find the causal variant, one may prefer to
use a measure that tries to quantify the distance between markers and a causal variant, or at
least orders the markers with respect to their distance to the causal variant. Of course, it is not
possible to measure the physical distance between markers and a causal variant, but, locally,
the markers can be ordered with respect to the degree of linkage disequilibrium with a causal
variant. In Section 2.3 we have seen that $W/\sqrt{m} \approx B\Delta$, where $B$ is equal for all
markers that are in LD with the same causal variant. That means that ranking the markers (in
a small region of the chromosome) based on the test-statistic $|W|$ is equivalent to ranking them
based on (an estimate of) the degree to which the marker is in LD with the causal variant ($\Delta$).
Although it is known that $\Delta$ between markers and a causal variant is not necessarily a monotone
function of the physical distance between the two, it is expected that there is a positive relationship
and that the causal variant will be located near the markers which are strongest in LD with the
causal variant.

3 Results 

This section is divided into two subsections. In the first subsection the asymptotic power
functions of the two tests are compared. In practice, the sample sizes are finite. In the second
subsection we perform simulations to study the type I error for finite samples, and the effect of
the extra variability due to the fact that $\pi$ is estimated.

3.1 Comparison of the asymptotic power functions


We define the $A_1$-allele at the causal variant as the high risk allele and the disease probabilities
are taken equal to $\pi_{11} = 0.60$, $\pi_{22} = 0.10$ and, for the additive model, $\pi_{12} = (\pi_{11} + \pi_{22})/2 = 0.35$.
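
To make this genetic model concrete, the sketch below derives the prevalence $\pi$, the $A_1$ allele frequencies among controls and cases, and the quantity $B$ of (4) from the penetrances. It assumes Hardy-Weinberg equilibrium at the causal variant in the general population, which is our assumption; under it the additive model gives $\pi = 0.125$ for $p_1 = 0.05$ and $\pi = 0.15$ for $p_1 = 0.10$, the prevalences quoted for these settings in the text. The function name is hypothetical.

```python
from math import sqrt

def causal_variant_summary(p1, pen11, pen12, pen22):
    """Prevalence, A1 frequency in controls and cases, and B, assuming HWE."""
    p2 = 1 - p1
    # Genotype frequencies under Hardy-Weinberg equilibrium (an assumption)
    g11, g12, g22 = p1 * p1, 2 * p1 * p2, p2 * p2
    prevalence = g11 * pen11 + g12 * pen12 + g22 * pen22
    # A1 allele frequency among cases and controls; heterozygotes carry one A1
    p1_cases = (g11 * pen11 + 0.5 * g12 * pen12) / prevalence
    p1_controls = (g11 * (1 - pen11) + 0.5 * g12 * (1 - pen12)) / (1 - prevalence)
    B = (p1_controls - p1_cases) / sqrt(p1 * p2)
    return prevalence, p1_controls, p1_cases, B

# Additive model of this section with p1 = 0.05
prevalence, p10, p11, B = causal_variant_summary(0.05, 0.60, 0.35, 0.10)
```

Because $A_1$ is the high risk allele, it is more frequent among cases than among controls, so $B$ is negative in this parametrization.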

In Figure 1 the asymptotic power function is given as a function of $q_1$ (first and second plot)
and of $\Delta$ (third plot). In all cases $R = S = 1000$ and $\alpha = 1.0 \times 10^{-8}$. In the first plot $p_1 = 0.05$ and
in the second plot $p_1 = 0.15$. In both cases $\Delta = 0.3$. The power function for the test-statistic
$W$ is plotted as a continuous line and for $T$ as a dashed line. In the rightmost plot in Figure 1,
the power function is given as a function of $\Delta$. Now, $p_1 = 0.05$ and $q_1$ equals either $p_1$ or $3p_1$.
Again, the dashed lines represent the power for the test with test-statistic $T$ ($q_1 = p_1$ lower line,
$q_1 = 3p_1$ upper line) and the continuous lines the power for the test with test-statistic $W$ (the two
lines overlap). For all plots we inserted the true value of $\pi$ in the power function, because it will be
shown below that the power function is robust against misspecification of the parameter $\pi$.

From the plots it can be concluded that the test based on test-statistic $W$ is more powerful
than the one based on $T$ for the genetic models described above. The power based on $W$ is
approximately constant as a function of $q_1$, whereas the power based on $T$ increases with $q_1$. So,
$W$ does not, a priori, favor markers with a large minor allele frequency. This makes the p-values
comparable across markers. This does not hold for $T$. The power functions were also plotted for
more genetic models, including the dominant and the recessive model. As long as the correlation
between $A_1$ and $M_1$ is positive, the conclusions remain the same.
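
The comparison can be reproduced approximately from the formulas (1), (4), (5) and (7). The sketch below is an illustration under stated assumptions: the values of $B$ and $\pi$ for $p_1 = 0.05$ are derived under Hardy-Weinberg equilibrium at the causal variant (our assumption), the true $\pi$ is inserted for $\hat\pi$, and the function name `asymptotic_powers` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def asymptotic_powers(q1, delta_ld, B, prevalence, S, R, alpha):
    """Asymptotic power of T, from (5), and of W, from (7)."""
    N = S + R
    lam = R / N
    m = 2 * N * lam * (1 - lam)
    # Relation (1): q10 - q11 = Delta * B * sqrt(q1 * q2)
    diff = delta_ld * B * sqrt(q1 * (1 - q1))
    # Solve q1 = pi*q11 + (1 - pi)*q10 together with q10 - q11 = diff
    q10 = q1 + prevalence * diff
    q11 = q1 - (1 - prevalence) * diff
    pooled = lam * q10 * (1 - q10) + (1 - lam) * q11 * (1 - q11)
    Q = sqrt(q1 * (1 - q1) / pooled)          # definition (4)
    shift = sqrt(m) * B * delta_ld * Q
    nd = NormalDist()
    z = -nd.inv_cdf(alpha / 2)                # upper alpha/2 quantile
    power_T = 1 - nd.cdf(z - shift) + nd.cdf(-z - shift)
    power_W = 1 - nd.cdf(z * Q - shift) + nd.cdf(-z * Q - shift)
    return power_T, power_W

# Additive model with p1 = 0.05: prevalence 0.125 and A1 frequencies
# 0.031875/0.875 (controls) and 0.145 (cases), derived assuming HWE
prevalence = 0.125
B = (0.031875 / 0.875 - 0.145) / sqrt(0.05 * 0.95)
powers = {q1: asymptotic_powers(q1, 0.3, B, prevalence, 1000, 1000, 1e-8)
          for q1 in (0.05, 0.10, 0.15)}
```

In this sketch the power of $W$ exceeds that of $T$ for each of the three marker frequencies, and the power of $T$ grows with $q_1$.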

If the minor allele is negatively correlated with the causal allele $A_1$, the theory tells us
that the test based on $T$ is more powerful. We consider the same setting as before. In the left
plot of Figure 2 the power functions for the two tests are given as a function of $q_1$ for $\Delta = -0.40$
and $p_1 = 0.60$. The power of the test based on $T$ is indeed higher. However, note the low power
of both tests. In the right plot of Figure 2 the power is plotted as a function of $\Delta$. Again, the
test based on $T$ is more powerful if $\Delta < 0$.

Misspecification of $\pi$

The denominator of the test-statistic $W$ contains the parameter $\pi$, the prevalence of the disease,
which cannot be estimated from the case-control data itself, but has to be estimated based on
data from a different source. Misspecification of this parameter affects the power of the test.
Therefore, the power function is considered for different values of the estimate $\hat\pi$. So, actually
the power of the test based on $W_\delta$ is considered for different values of $\delta$ near $\pi$. That means
that misspecification of $\pi$ will never lead to a test with inflated type I error (if the sample size
is big enough), but to a different test which has a priori a slight preference for markers with
either small (if $\hat\pi < \pi$) or large (if $\hat\pi > \pi$) minor allele frequencies.

The power function was computed for exactly the same models as before. The
results are given in Figure 3. The power was computed for $\hat\pi = 0.2$ (lowest continuous line),
$\hat\pi = \pi = 0.125$ (continuous line in the middle) and $\hat\pi = 0.075$ (upper line). Since the power function
for $T$ does not contain $\pi$, only one curve is found for $T$: the dashed line. More models were considered.
In all cases similar results were obtained. We conclude that the allele based test with the
test-statistic $W$ is robust against small misspecification of $\pi$.




[Figure 1: three panels. Panel titles: "q1, p1 = 0.05, Delta = 0.30"; "q1, p1 = 0.15, Delta = 0.30"; "Delta, p1 = 0.05".]

Figure 1: Power functions for $T$ (dashed lines) and $W$ (continuous lines). Additive model with
$\pi_{11} = 0.60$, $\pi_{22} = 0.10$ and $\pi_{12} = 0.35$.

[Figure 2: left panel title: "q1, Delta = -0.40, p1 = 0.60"; horizontal axis from 0.15 to 0.45.]

Figure 2: Power functions for $T$ (dashed lines) and $W$ (continuous lines) in the additive model
with, in the left plot, $\Delta = -0.40$.

Power of $W_\delta$

In the previous paragraph we considered the effect of small deviations of $\delta$ near $\pi$ on the power
of $W_\delta$. In this paragraph, we consider what happens if $\delta$ runs from 0 to 1. In Figure 4 the
asymptotic power of $W_\delta$ is plotted as a function of $q_1$, for different values of $\delta$. The fat line
(third line from above) indicates the power function for $W = W_\pi$ with $\pi = 0.125$, and the thin
continuous lines those for different values of $\delta$ ($\delta = 0, 0.1, \ldots, 0.9, 1.0$, from upper to lowest line). It
can be seen that only for $\delta \approx \pi$ the lines are more or less constant; the power is not affected by
the marker allele frequency. Above the fat line, the power functions are slightly decreasing as a
function of $q_1$, and below the fat line, the power is increasing.


3.2 Finite samples 


The power function and the computation of p-values are based on asymptotic normality of the
test-statistic $W$. In practice the sample sizes are finite. Therefore the normal
approximation may not be exact and a continuity correction may improve this approximation.
A correction can be done in multiple ways. We used the following:


\[ W_{\mathrm{cor}} = \sqrt{m}\, \frac{\hat q_{1|0} - \hat q_{1|1} \pm \tfrac{1}{2} \min\{S, R\}/(2SR)}{\sqrt{\hat q_1 \hat q_2}} \tag{10} \]


where the “maximum” could be replaced by the “minimum” if one prefers to correct less. The
null distribution of the test-statistic is asymptotically standard normal, but the finite sample
distribution may differ from the asymptotic distribution. We therefore performed a simulation
study to check the type I error for the three tests. As before we take $\pi_{11} = 0.60$ and $\pi_{22} = 0.10$
in the additive model, now with $p_1 = 0.10$. Since we simulate under the null hypothesis, $\Delta = 0$.
We simulate 100,000,000 times $R$ cases and $S$ controls from the model given in Table 1, compute
the test-statistics $T$, $W$ and $W_{\mathrm{cor}}$ and the corresponding p-values, and estimate the type I error
as the fraction of p-values smaller than $\alpha$. The results are given in Table 1.

From the table it can be seen that the type I error for the test based on $W$ is slightly inflated,
especially if the sample sizes are small and the MAF of the marker is low. This inflation disappears
if the sample size or the MAF grows. After the continuity correction described in (10) there still
seems to be some inflation, but it is smaller.

In the previous example the value of $\pi$ is 0.15. If the prevalence of the disease in the
population is lower, the sample size is small (around 500 cases and 500 controls) and the MAF
of the marker is also quite low (0.10 or lower), the type I error inflation may become unacceptable.
In that case one could decide to insert a higher value of $\delta$, which diminishes the type I error, but
decreases the power of the test under the alternative hypothesis.

We also performed several simulation studies under the alternative hypothesis. In all cases 
the power functions based on finite samples were very similar to the asymptotic power function 
(results not shown). 
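
A small-scale version of such a type I error simulation can be sketched as follows. It is far smaller than the study described above (2,000 replicates at $\alpha = 0.05$ instead of 100,000,000 replicates at $\alpha \leq 10^{-3}$), it draws the alleles of each sample as independent Bernoulli variables with a common frequency $q_1$ (our assumption for the null model), and it omits the continuity correction. The function name and all parameter values are hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist

def simulate_type1_error(q1, S, R, pi_hat, alpha, n_rep, seed=1):
    """Monte Carlo estimate of the type I error of T and W under H0."""
    rng = random.Random(seed)
    z = -NormalDist().inv_cdf(alpha / 2)
    N = S + R
    lam = R / N
    m = 2 * N * lam * (1 - lam)
    reject_T = reject_W = 0
    for _ in range(n_rep):
        # Under H0 cases and controls share the same M1 frequency q1
        S1 = sum(rng.random() < q1 for _ in range(2 * S))
        R1 = sum(rng.random() < q1 for _ in range(2 * R))
        q10, q11 = S1 / (2 * S), R1 / (2 * R)
        pooled = lam * q10 * (1 - q10) + (1 - lam) * q11 * (1 - q11)
        q1_hat = pi_hat * q11 + (1 - pi_hat) * q10
        if pooled == 0 or q1_hat in (0.0, 1.0):
            continue  # degenerate sample, counted as a non-rejection
        num = sqrt(m) * (q10 - q11)
        reject_T += abs(num / sqrt(pooled)) > z
        reject_W += abs(num / sqrt(q1_hat * (1 - q1_hat))) > z
    return reject_T / n_rep, reject_W / n_rep

err_T, err_W = simulate_type1_error(q1=0.25, S=200, R=200, pi_hat=0.15,
                                    alpha=0.05, n_rep=2000)
```

With so few replicates the estimates are only accurate to about half a percentage point, so this sketch cannot resolve the small inflation factors reported in Table 1; it only illustrates the procedure.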


[Figure 3: panel title: "q1, Delta = 0.40, p1 = 0.05"; horizontal axis ticks at 0.05, 0.15, 0.25.]

Figure 3: Power function based on statistic $W$ as a function of $q_1$ for different values of $\hat\pi$. From
lowest continuous line to uppermost line: $\hat\pi = 0.20$, $\hat\pi = \pi$, and $\hat\pi = 0.075$. The dashed line
gives the power for the test based on test-statistic $T$.

[Figure 4: panel title: "q1, Delta = 0.40, p1 = 0.05".]

Figure 4: Power function based on statistic $W_\delta$ as a function of $q_1$ for different values of $\delta$.
Dashed line: power based on $T$. Fat line (third line from above): power based on $W = W_\pi$.
The other lines are the power functions based on $W_\delta$ for $\delta = 0, 0.1, \ldots, 0.9, 1.0$, from upper to
lowest line.

                                     q1 = 0.10                        q1 = 0.25
α                          1 x 10^-3  1 x 10^-4  1 x 10^-5  1 x 10^-3  1 x 10^-4  1 x 10^-5
δ            test          (x 10^-3)  (x 10^-4)  (x 10^-5)  (x 10^-3)  (x 10^-4)  (x 10^-5)

R = S = 500
             T               1.00124     0.9698      1.006    1.02640     1.0378      1.118
δ = π        W_cor           1.12314     1.5258      2.324    0.97097     1.0419      1.175
             W               1.27542     1.7627      2.786    1.06058     1.1574      1.300
δ = 0.20     W_cor,δ         1.04681     1.3322      1.893    0.95332     0.9981      1.111
             W_δ             1.19209     1.5282      2.145    1.04465     1.1069      1.229
δ = 0.30     W_cor,δ         0.93276     1.0104      1.226    0.92657     0.9375      0.969
             W_δ             1.06701     1.1830      1.451    1.01572     1.0387      1.100
δ = 0.40     W_cor,δ         0.87548     0.8479      0.899    0.91026     0.8918      0.922
             W_δ             1.00305     0.9801      1.040    0.99397     0.9995      1.027

R = S = 1000
             T               1.00363     0.9930      1.023    1.02000     1.0359      1.005
δ = π        W_cor           1.03673     1.2424      1.659    0.96959     1.0172      1.059
             W               1.12922     1.3821      1.845    1.03510     1.0996      1.155
δ = 0.20     W_cor,δ         1.00093     1.1403      1.433    0.96411     0.9991      1.010
             W_δ             1.09388     1.2723      1.607    1.02742     1.0754      1.104
δ = 0.30     W_cor,δ         0.94062     0.9855      1.095    0.94886     0.9643      0.946
             W_δ             1.03363     1.1026      1.228    1.01472     1.0400      1.025
δ = 0.40     W_cor,δ         0.90898     0.8968      0.923    0.94026     0.9457      0.894
             W_δ             0.99986     0.9976      1.041    1.00667     1.0146      0.976

R = S = 2000
             T               1.00585     1.0156      1.034    1.00851     1.0144      1.032
δ = π        W_cor           1.00007     1.1170      1.368    0.96987     0.9840      1.025
             W               1.06733     1.2046      1.475    1.01565     1.0422      1.099
δ = 0.20     W_cor,δ         0.98525     1.0636      1.245    0.96734     0.9793      0.988
             W_δ             1.04942     1.1459      1.345    1.01188     1.0290      1.064
δ = 0.30     W_cor,δ         0.95502     0.9889      1.075    0.96125     0.9620      0.990
             W_δ             1.01906     1.0649      1.157    1.00730     1.0171      1.032
δ = 0.40     W_cor,δ         0.93863     0.9400      0.973    0.95732     0.9536      0.971
             W_δ             1.00167     1.0156      1.052    1.00206     1.0008      1.016

Table 1: Type I error for the three tests for several values of $\delta$, different genetic models, and
different sample sizes ($\pi = 0.15$). Entries are given in the units shown in the third header row.

Combined Test

We performed several simulation studies to study the performance of the combined test with
test-statistic $U$. The results are as expected. For large sample sizes the null distribution is close
to a standard normal distribution (concluded from QQ-plots and histograms of the p-values)
and the power function equals the maximum of the power functions for $T$ and $W$. For small
sample sizes and/or low minor allele frequencies at the marker, the null distribution deviates
from the standard normal distribution in just one tail. For all observations in that tail $\hat Q < 1$
(the test-statistic $U$ equals $W$). This was also seen for the test-statistic $W$ and is therefore as
expected. This problem can easily be solved by recalculating the small p-values in the tail by
permutations.

The observations in the opposite tail all had $\hat Q > 1$ ($U$ equals $T$) and the fit with the standard
normal distribution is good (as expected, since the null distribution for $T$ is well approximated by
the standard normal distribution also if the sample size is small). The results of these simulation
studies are not shown in this paper because, in our opinion, they are as expected and do not add
much to the paper.

4 Discussion 

Several methods have been proposed for selecting markers for follow-up in a case-control
association study. One of the most popular tests is the allele based test that considers the
difference in marker allele frequency between cases and controls. A disadvantage of this test is
its preference for markers with a high minor allele frequency; markers with a low minor allele
frequency have less power to be detected than markers with a high minor allele frequency.

In this paper a new allele based association test for finding markers that are associated with
a disease is proposed. The test is model-free and the test-statistic can be computed easily and
fast. Moreover, the asymptotic null distribution of the test-statistic is known, which makes the
computation of p-values fast; permutations are not necessary. The power of the test is higher
than that of the commonly used test for many genetic models that are of practical interest. For
those models, the lower the minor allele frequency is, the more power is gained by using the new
test. This is mainly caused by the fact that the power of the proposed test is approximately
constant as a function of the marker allele frequency; the proposed test does not favor markers
with a high minor allele frequency. So, ranking the markers based on their p-values becomes
a more objective way of selecting interesting markers for follow-up studies and, because of its
high power for markers with a low minor allele frequency (compared to the existing test), the
proposed test may reveal interesting regions for follow-up study.

The proposed test-statistic depends on the parameter π, the prevalence of the disease. This
parameter cannot be estimated from the data itself, but should be estimated from external
data. For most diseases, population risks are available, for instance from national registries. In
the paper it is shown that the type I error is hardly affected by misspecification of π, because
under the null hypothesis the parameter (almost) drops out of the test-statistic if the number
of observations is large. In practice, it is quite common to estimate nuisance parameters in
the model based on external data and to assume that these estimated nuisance parameters are
known when performing statistical tests or constructing confidence intervals for the parameter
of interest. The extra uncertainty due to the estimation of these nuisance parameters is often
not taken into account, which may, in general, lead to incorrect type I errors. In [4] the effect on
the type I error for the likelihood ratio test-statistic is studied.

We generalized the model by allowing other values for π between zero and one. This may
yield higher power for some markers, but the a priori preference for markers with high or low
MAFs is back again. Moreover, by taking values near the boundary (zero or one), the type I
error may increase above acceptable levels. We therefore advise using π as the first choice.
However, if π itself is low (near zero) and the sample size is not huge, the type I error may be
inflated if the marker MAF is low. In that case one could decide to impute a higher value for δ
in the denominator of the test-statistic, so δ > π, to give up some power and lower the number
of false discoveries. This was seen in the simulation study in this manuscript.

In the derivation of the test-statistic W as well as in the simulation studies we assume that the
alleles at a marker and causal variant are in Hardy-Weinberg equilibrium (HWE). However,
deviations from Hardy-Weinberg equilibrium can inflate the chance of a false-positive association
([6]). In [6] a test-statistic that accounts for deviation from HWE is introduced; an extra term is
added to the variance in the denominator of the test-statistic. Our test-statistic can be adjusted
in a similar way, which makes the type I error of the test robust against deviation from
Hardy-Weinberg equilibrium.
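
To illustrate what such an extra variance term can look like, the sketch below adjusts the standard pooled-variance allele statistic, not W itself, whose formula is not restated here. It assumes that the added term is the pooled Hardy-Weinberg disequilibrium coefficient d = P(AA) − p², which is zero under exact Hardy-Weinberg proportions; whether this matches the exact form used in [6] is an assumption, and the function name is hypothetical.

```python
import math

def allele_z_hwd_adjusted(cases, controls):
    """Allele-based z-statistic with a Hardy-Weinberg disequilibrium term.

    cases, controls: genotype counts (n0, n1, n2), indexed by the number of
    copies of the counted allele.
    """
    R, S = sum(cases), sum(controls)
    N = R + S
    n1 = cases[1] + controls[1]
    n2 = cases[2] + controls[2]
    p = (n1 + 2 * n2) / (2 * N)        # pooled allele frequency
    d = n2 / N - p * p                 # pooled disequilibrium coefficient (assumed extra term)
    p_case = (cases[1] + 2 * cases[2]) / (2 * R)
    p_ctrl = (controls[1] + 2 * controls[2]) / (2 * S)
    var = (p * (1 - p) + d) * (1 / (2 * R) + 1 / (2 * S))
    return (p_case - p_ctrl) / math.sqrt(var)

# Pooled sample with an excess of homozygotes (d > 0): the adjusted variance
# is larger than p(1 - p), so the statistic is smaller than the unadjusted one.
z_adjusted = allele_z_hwd_adjusted((70, 40, 90), (90, 40, 70))
```

With d > 0 the unadjusted statistic overstates the evidence, which is the inflation of false-positive associations referred to above; with d = 0 the adjusted and unadjusted statistics coincide.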

When the Hardy-Weinberg proportions hold in the total population, the allele-based test 
T and the Cochran-Armitage Trend test for the additive model are asymptotically equivalent 
under the null hypothesis mm)- Since T and W are also asymptotically equivalent under the 
null hypothesis, this also holds for W and the Cochran-Armitage Trend test (CATT). Under the 
alternative hypothesis, the power functions differ. Based on a simulation study, m show that 
the power of the allele-based test T and the CATT for additive models are, nevertheless, very 
similar, with a slightly higher power for the allele-based test T under the recessive model and 
for the CATT under the dominant model (for the genetic models they consider). For the genetic
models for which the test based on W is more powerful than the test based on T, the test based 
on W is also more powerful than the CATT. 
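
The relation between the allele-based test and the CATT can be checked numerically. The sketch below is an illustration under assumptions: `allele_z` is the standard pooled-variance allele statistic (taken here as a stand-in for T), `catt_z` is the usual trend statistic with additive weights (0, 1, 2), and both function names are hypothetical.

```python
import math

def allele_z(cases, controls):
    """Pooled-variance allele-based z-statistic from genotype counts (n0, n1, n2)."""
    R, S = sum(cases), sum(controls)
    a_case = cases[1] + 2 * cases[2]
    a_ctrl = controls[1] + 2 * controls[2]
    p = (a_case + a_ctrl) / (2 * (R + S))
    diff = a_case / (2 * R) - a_ctrl / (2 * S)
    return diff / math.sqrt(p * (1 - p) * (1 / (2 * R) + 1 / (2 * S)))

def catt_z(cases, controls, weights=(0, 1, 2)):
    """Cochran-Armitage trend statistic; the default weights give the additive model."""
    R, S = sum(cases), sum(controls)
    N = R + S
    totals = [r + s for r, s in zip(cases, controls)]
    u = sum(x * (S * r - R * s) for x, r, s in zip(weights, cases, controls))
    sum_x2 = sum(x * x * n for x, n in zip(weights, totals))
    sum_x = sum(x * n for x, n in zip(weights, totals))
    var = R * S / N * (N * sum_x2 - sum_x ** 2)
    return u / math.sqrt(var)

# Pooled genotype counts (100, 200, 100) are in exact Hardy-Weinberg proportions.
hwe_cases, hwe_controls = (40, 100, 60), (60, 100, 40)
# Pooled genotype counts (160, 80, 160) show an excess of homozygotes.
dev_cases, dev_controls = (70, 40, 90), (90, 40, 70)

z_allele_hwe, z_catt_hwe = allele_z(hwe_cases, hwe_controls), catt_z(hwe_cases, hwe_controls)
z_allele_dev, z_catt_dev = allele_z(dev_cases, dev_controls), catt_z(dev_cases, dev_controls)
```

When the pooled genotype counts are in exact Hardy-Weinberg proportions the two statistics coincide; in the second example, with the same allele frequencies but an excess of homozygotes, the allele-based statistic is larger than the trend statistic.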

Another association test is a score test based on a logistic regression model. The score test
statistic equals the CATT ([15]), from which it follows, together with the previous paragraph,
that in many interesting settings the allele-based test W is also more powerful than the score
test for a logistic regression model.

For some genetic settings the test based on T is more powerful than the test based on W. We
therefore combined the two test-statistics into a test-statistic U whose (asymptotic) power is
always at least that of the tests based on T and W. Although the test-statistic is more
complicated now, it still has asymptotically a standard normal distribution under the null
hypothesis and p-values can be easily obtained.

Part of the missing heritability might be explained by causal variants with a low minor
allele frequency. The test proposed in this paper may help to detect some of these undiscovered
causal variants.